SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Yang Wang, Michael W. Spellman 
(ii) TITLE OF INVENTION: O-Fucosyltransferase
(iii) NUMBER OF SEQUENCES: 17 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Genentech, Inc. 

(B) STREET: 1 DNA Way 

(C) CITY: South San Francisco 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94080 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3.5 inch, 1.44 Mb floppy disk

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: WinPatin (Genentech) 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: Unassigned 

(B) FILING DATE: 26-Nov-1997 

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/792498 

(B) FILING DATE: 31 January 1997 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Svoboda, Craig G.

(B) REGISTRATION NUMBER: 39,044 

(C) REFERENCE/DOCKET NUMBER: P1041P1

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 650/225-1489 

(B) TELEFAX: 650/952-9881 
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1514 base pairs 

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATGCCCGCGG GCTCCTGGGA CCCGGCCGGT TACCTGCTCT ACTGCCCCTG 50 
CATGGGGCGC TTTGGGAACC AGGCCGATCA CTTCTTGGGC TCTCTGGCAT 100 
TTGCAAAGCT GCTAAACCGT ACCTTGGCTG TCCCTCCTTG GATTGAGTAC 150 
CAGCATCACA AGCCTCCTTT CACCAACCTC CATGTGTCCT ACCAGAAGTA 200
CTTCAAGCTG GAGCCCCTCC AGGCTTACCA TCGGGTCATC AGCTTGGAGG 250 
ATTTCATGGA GAAGCTGGCA CCCACCCACT GGCCCCCTGA GAAGCGGGTG 300 
GCATACTGCT TTGAGGTGGC AGCCCAGCGA AGCCCAGATA AGAAGACGTG 350 
CCCCATGAAG GAAGGAAACC CCTTTGGCCC ATTCTGGGAT CAGTTTCATG 400 
TGAGTTTCAA CAAGTCGGAG CTTTTTACAG GCATTTCCTT CAGTGCTTCC 450 
TACAGAGAAC AATGGAGCCA GAGATTTTCT CCAAAGGAAC ATCCGGTGCT 500 
TGCCCTGCCA GGAGCCCCAG CCCAGTTCCC CGTCCTAGAA GAACACAGGC 550 
CACTACAGAA GTACATGGTA TGGTCAGACG AAATGGTGAA GACGGGAGAG 600 
GCCCAGATTC ATGCCCACCT TGTCCGGCCC TATGTGGGCA TTCATCTGCG 650 
CATTGGCTCT GACTGGAAGA ACGCCTGTGC CATGCTGAAG GACGGGACTG 700 
CAGGCTCGCA CTTCATGGCC TCTCCGCAGT GTGTGGGCTA CAGCCGCAGC 750 
ACAGCGGCCC CCCTCACGAT GACTATGTGC CTGCCTGACC TGAAGGAGAT 800 
CCAGAGGGCT GTGAAGCTCT GGGTGAGGTC GCTGGATGCC CAGTCGGTCT 850 



ACGTTGCTAC TGATTCCGAG AGTTATGTGC CTGAGCTCCA ACAGCTCTTC 900 
AAAGGGAAGG TGAAGGTGGT GAGCCTGAAG CCTGAGGTGG CCCAGGTCGA 950 
CCTGTACATC CTCGGCCAAG CCGACCACTT TATTGGCAAC TGTGTCTCCT 1000 
CCTTCACTGC CTTTGTGAAG CGGGAGCGGG ACCTCCAGGG GAGGCCGTCT 1050 
TCTTTCTTCG GCATGGACAG GCCCCCTAAG CTGCGGGACG AGTTCTGATT 1100 
CTGGCCGGAG CACCAGACCC TCTGATCCTG GAGGGACCAG AGTCTGAGCT 1150 
GGTCCTTCCA GCCAGGCCTG GCAGCCAGAG GTGCTCCGGG ATTGCAAACT 1200 
CCTCTTCTCA CCTGCCAAAG ATGGAGAAGA GTGCCAGGGA CCCCTCAAGG 1250 
AGGGAGACGC TCCATATCCC AGGGCATAGG ACTTGCAGGT TCCTAGGAGC 1300
AGGAGCATCT CCCATCGCAC GTGCTTTCTG CTCTTCTGGG AATTTCTCAC 1350
ACTGGCAAAG CAGTCCAGCC TCCGTCTTCT GGTCCACTCT GCTCTGAGCA 1400
GCCTGGGATG CTGAACTCTT CAGAGAGATT TTTTTATAGA GAGATTTCTA 1450
TAATTTTGAT ACAAGGTCAT GACTATCCTA GAACTCTCTG TGGTTTTTGA 1500
AAATCATTGA ATTC 1514

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Pro Ala Gly Ser Trp Asp Pro Ala Gly Tyr Leu Leu Tyr Cys
1 5 10 15

Pro Cys Met Gly Arg Phe Gly Asn Gln Ala Asp His Phe Leu Gly
20 25 30

Ser Leu Ala Phe Ala Lys Leu Leu Asn Arg Thr Leu Ala Val Pro
35 40 45

Pro Trp Ile Glu Tyr Gln His His Lys Pro Pro Phe Thr Asn Leu
50 55 60

His Val Ser Tyr Gln Lys Tyr Phe Lys Leu Glu Pro Leu Gln Ala
65 70 75

Tyr His Arg Val Ile Ser Leu Glu Asp Phe Met Glu Lys Leu Ala
80 85 90

Pro Thr His Trp Pro Pro Glu Lys Arg Val Ala Tyr Cys Phe Glu
95 100 105

Val Ala Ala Gln Arg Ser Pro Asp Lys Lys Thr Cys Pro Met Lys
110 115 120

Glu Gly Asn Pro Phe Gly Pro Phe Trp Asp Gln Phe His Val Ser
125 130 135

Phe Asn Lys Ser Glu Leu Phe Thr Gly Ile Ser Phe Ser Ala Ser
140 145 150

Tyr Arg Glu Gln Trp Ser Gln Arg Phe Ser Pro Lys Glu His Pro
155 160 165

Val Leu Ala Leu Pro Gly Ala Pro Ala Gln Phe Pro Val Leu Glu
170 175 180

Glu His Arg Pro Leu Gln Lys Tyr Met Val Trp Ser Asp Glu Met
185 190 195

Val Lys Thr Gly Glu Ala Gln Ile His Ala His Leu Val Arg Pro
200 205 210

Tyr Val Gly Ile His Leu Arg Ile Gly Ser Asp Trp Lys Asn Ala
215 220 225

Cys Ala Met Leu Lys Asp Gly Thr Ala Gly Ser His Phe Met Ala
230 235 240

Ser Pro Gln Cys Val Gly Tyr Ser Arg Ser Thr Ala Ala Pro Leu
245 250 255

Thr Met Thr Met Cys Leu Pro Asp Leu Lys Glu Ile Gln Arg Ala
260 265 270

Val Lys Leu Trp Val Arg Ser Leu Asp Ala Gln Ser Val Tyr Val
275 280 285

Ala Thr Asp Ser Glu Ser Tyr Val Pro Glu Leu Gln Gln Leu Phe
290 295 300

Lys Gly Lys Val Lys Val Val Ser Leu Lys Pro Glu Val Ala Gln
305 310 315

Val Asp Leu Tyr Ile Leu Gly Gln Ala Asp His Phe Ile Gly Asn
320 325 330

Cys Val Ser Ser Phe Thr Ala Phe Val Lys Arg Glu Arg Asp Leu
335 340 345

Gln Gly Arg Pro Ser Ser Phe Phe Gly Met Asp Arg Pro Pro Lys
350 355 360

Leu Arg Asp Glu Phe
365

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 amino acids 

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Arg Leu Ala Gly Ser Trp Asp Leu Ala Gly Tyr Leu Leu Tyr Xaa
1 5 10 15

Pro Xaa Met Gly Arg Phe Gly Asn Gln Ala Asp His Phe Leu Gly
20 25 30

Ser Leu Ala Phe Ala Lys Leu Xaa Val Arg Thr Leu Ala Val Pro
35 40 45

Pro Trp Ile Glu Tyr Gln His His Lys Pro Pro Phe Thr Asn Leu
50 55 60

His
61

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1300 base pairs 

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 



TTATTCATAC CGTCCCACCA TCGGGCGCGG ATCAGATCCA TGGCCAAGTT 50 
CCTGGTCAAC GTGGCCCTGC TGCTGCTGCT GCTGCTGCTG TCCGGAGCCT 100 
GGGCCCATAT GAGATCCCAT CACCATCACC ATCACATGCC CGCGGGCTCC 150 
TGGGACCCGG CCGGTTACCT GCTCTACTGC CCCTGCATGG GGCGCTTTGG 200 
GAACCAGGCC GATCACTTCT TGGGCTCTCT GGCATTTGCA AAGCTGCTAA 250
ACCGTACCTT GGCTGTCCCT CCTTGGATTG AGTACCAGCA TCACAAGCCT 300
CCTTTCACCA ACCTCCATGT GTCCTACCAG AAGTACTTCA AGCTGGAGCC 350
CCTCCAGGCT TACCATCGGG TCATCAGCTT GGAGGATTTC ATGGAGAAGC 400 
TGGCACCCAC CCACTGGCCC CCTGAGAAGC GGGTGGCATA CTGCTTTGAG 450 
GTGGCAGCCC AGCGAAGCCC AGATAAGAAG ACGTGCCCCA TGAAGGAAGG 500 
AAACCCCTTT GGCCCATTCT GGGATCAGTT TCATGTGAGT TTCAACAAGT 550 
CGGAGCTTTT TACAGGCATT TCCTTCAGTG CTTCCTACAG AGAACAATGG 600
AGCCAGAGAT TTTCTCCAAA GGAACATCCG GTGCTTGCCC TGCCAGGAGC 650
CCCAGCCCAG TTCCCCGTCC TAGAGGAACA CAGGCCACTA CAGAAGTACA 700 
TGGTATGGTC AGACGAAATG GTGAAGACGG GAGAGGCCCA GATTCATGCC 750 
CACCTTGTCC GGCCCTATGT GGGCATTCAT CTGCGCATTG GCTCTGACTG 800 
GAAGAACGCC TGTGCCATGC TGAAGGACGG GACTGCAGGC TCGCACTTCA 850 
TGGCCTCTCC GCAGTGTGTG GGCTACAGCC GCAGCACAGC GGCCCCCCTC 900 
ACGATGACTA TGTGCCTGCC TGACCTGAAG GAGATCCAGA GGGCTGTGAA 950 
GCTCTGGGTG AGGTCGCTGG ATGCCCAGTC GGTCTACGTT GCTACTGATT 1000 
CCGAGAGTTA TGTGCCTGAG CTCCAACAGC TCTTCAAAGG GAAGGTGAAG 1050 
GTGGTGAGCC TGAAGCCTGA GGTGGCCCAG GTCGACCTGT ACATCCTCGG 1100 
CCAAGCCGAC CACTTTATTG GCAACTGTGT CTCCTCCTTC ACTGCCTTTG 1150 
TGAAGCGGGA GCGGGACCTC CAGGGGAGGC CGTCTTCTTT CTTCGGCATG 1200 
GACAGGCCCC CTAAGCTGCG GGACGAGTTC TGATTCTGGC CGGAGCACCA 1250 
GACCCTCTGA TCCTGGAGGG ACCAGAGTCT GAGCTGGTCC TTCCAGCCAG 1300

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11284 base pairs

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

AAGCTTTACT CGTAAAGCGA GTTGAAGGAT CATATTTAGT TGCGTTTATG 50
AGATAAGATT GAAAGCACGT GTAAAATGTT TCCCGCGCGT TGGCACAACT 100
ATTTACAATG CGGCCAAGTT ATAAAAGATT CTAATCTGAT ATGTTTTAAA 150
ACACCTTTGC GGCCCGAGTT GTTTGCGTAC GTGACTAGCG AAGAAGATGT 200 
GTGGACCGCA GAACAGATAG TAAAACAAAA CCCTAGTATT GGAGCAATAA 250 
TCGATTTAAC CAACACGTCT AAATATTATG ATGGTGTGCA TTTTTTGCGG 300
GCGGGCCTGT TATACAAAAA AATTCAAGTA CCTGGCCAGA CTTTGCCGCC 350 
TGAAAGCATA GTTCAAGAAT TTATTGACAC GGTAAAAGAA TTTACAGAAA 400 
AGTGTCCCGG CATGTTGGTG GGCGTGCACT GCACACACGG TATTAATCGC 450 
ACCGGTTACA TGGTGTGCAG ATATTTAATG CACACCCTGG GTATTGCGCC 500 
GCAGGAAGCC ATAGATAGAT TCGAAAAAGC CAGAGGTCAC AAAATTGAAA 550 
GACAAAATTA CGTTCAAGAT TTATTAATTT AATTAATATT ATTTGCATTC 600
TTTAACAAAT ACTTTATCCT ATTTTCAAAT TGTTGCGCTT CTTCCAGCGA 650
ACCAAAACTA TGCTTCGCTT GCTCCGTTTA GCTTGTAGCC GATCAGTGGC 700 
GTTGTTCCAA TCGACGGTAG GATTAGGCCG GATATTCTCC ACCACAATGT 750 
TGGCAACGTT GATGTTACGT TTATGCTTTT GGTTTTCCAC GTACGTCTTT 800
TGGCCGGTAA TAGCCGTAAA CGTAGTGCCG TCGCGCGTCA CGCACAACAC 850
CGGATGTTTG CGCTTGTCCG CGGGGTATTG AACCGCGCGA TCCGACAAAT 900 
CCACCACTTT GGCAACTAAA TCGGTGACCT GCGCGTCTTT TTTCTGCATT 950 
ATTTCGTCTT TCTTTTGCAT GGTTTCCTGG AAGCCGGTGT ACATGCGGTT 1000
TAGATCAGTC ATGACGCGCG TGACCTGCAA ATCTTTGGCC TCGATCTGCT 1050
TGTCCTTGAT GGCAACGATG CGTTCAATAA ACTCTTGTTT TTTAACAAGT 1100
TCCTCGGTTT TTTGCGCCAC CACCGCTTGC AGCGCGTTTG TGTGCTCGGT 1150
GAATGTCGCA ATCAGCTTAG TCACCAACTG TTTGCTCTCC TCCTCCCGTT 1200
GTTTGATCGC GGGATCGTAC TTGCCGGTGC AGAGCACTTG AGGAATTACT 1250
TCTTCTAAAA GCCATTCTTG TAATTCTATG GCGTAAGGCA ATTTGGACTT 1300
CATAATCAGC TGAATCACGC CGGATTTAGT AATGAGCACT GTATGCGGCT 1350 
GCAAATACAG CGGGTCGCCC CTTTTCACGA CGCTGTTAGA GGTAGGGCCC 1400 
CCATTTTGGA TGGTCTGCTC AAATAACGAT TTGTATTTAT TGTCTACATG 1450
AACACGTATA GCTTTATCAC AAACTGTATA TTTTAAACTG TTAGCGACGT 1500
CCTTGGCCAC GAACCGGACC TGTTGGTCGC GCTCTAGCAC GTACCGCAGG 1550
TTGAACGTAT CTTCTCCAAA TTTAAATTCT CCAATTTTAA CGCGAGCCAT 1600 
TTTGATACAC GTGTGTCGAT TTTGCAACAA CTATTGTTTT TTAACGCAAA 1650 
CTAAACTTAT TGTGGTAAGC AATAATTAAA TATGGGGGAA CATGCGCCGC 1700 
TACAACACTC GTCGTTATGA ACGCAGACGG CGCCGGTCTC GGCGCAAGCG 1750 
GCTAAAACGT GTTGCGCGTT CAACGCGGCA AACATCGCAA AAGCCAATAG 1800
TACAGTTTTG ATTTGCATAT TAACGGCGAT TTTTTAAATT ATCTTATTTA 1850
ATAAATAGTT ATGACGCCTA CAACTCCCCG CCCGCGTTGA CTCGCTGCAC 1900
CTCGAGCAGT TCGTTGACGC CTTCCTCCGT GTGGCCGAAC ACGTCGAGCG 1950
GGTGGTCGAT GACCAGCGGC GTGCCGCACG CGACGCACAA GTATCTGTAC 2000
ACCGAATGAT CGTCGGGCGA AGGCACGTCG GCCTCCAAGT GGCAATATTG 2050
GCAAATTCGA AAATATATAC AGTTGGGTTG TTTGCGCATA TCTATCGTGG 2100
CGTTGGGCAT GTACGTCCGA ACGTTGATTT GCATGCAAGC CGAAATTAAA 2150
TCATTGCGAT TAGTGCGATT AAAACGTTGT ACATCCTCGC TTTTAATCAT 2200 
GCCGTCGATT AAATCGCGCA ATCGAGTCAA GTGATCAAAG TGTGGAATAA 2250
TGTTTTCTTT GTATTCCCGA GTCAAGCGCA GCGCGTATTT TAACAAACTA 2300
GCCATCTTGT AAGTTAGTTT CATTTAATGC AACTTTATCC AATAATATAT 2350
TATGTATCGC ACGTCAAGAA TTAACAATGC GCCCGTTGTC GCATCTCAAC 2400
ACGACTATGA TAGAGATCAA ATAAAGCGCG AATTAAATAG CTTGCGACGC 2450
AACGTGCACG ATCTGTGCAC GCGTTCCGGC ACGAGCTTTG ATTGTAATAA 2500
GTTTTTACGA AGCGATGACA TGACCCCCGT AGTGACAACG ATCACGCCCA 2550
AAAGAACTGC CGACTACAAA ATTACCGAGT ATGTCGGTGA CGTTAAAACT 2600
ATTAAGCCAT CCAATCGACC GTTAGTCGAA TCAGGACCGC TGGTGCGAGA 2650
AGCCGCGAAG TATGGCGAAT GCATCGTATA ACGTGTGGAG TCCGCTCATT 2700
AGAGCGTCAT GTTTAGACAA GAAAGCTACA TATTTAATTG ATCCCGATGA 2750
TTTTATTGAT AAATTGACCC TAACTCCATA CACGGTATTC TACAATGGCG 2800
GGGTTTTGGT CAAAATTTCC GGACTGCGAT TGTACATGCT GTTAACGGCT 2850
CCGCCCACTA TTAATGAAAT TAAAAATTCC AATTTTAAAA AACGCAGCAA 2900
GAGAAACATT TGTATGAAAG AATGCGTAGA AGGAAAGAAA AATGTCGTCG 2950
ACATGCTGAA CAACAAGATT AATATGCCTC CGTGTATAAA AAAAATATTG 3000
AACGATTTGA AAGAAAACAA TGTACCGCGC GGCGGTATGT ACAGGAAGAG 3050
GTTTATACTA AACTGTTACA TTGCAAACGT GGTTTCGTGT GCCAAGTGTG 3100
AAAACCGATG TTTAATCAAG GCTCTGACGC ATTTCTACAA CCACGACTCC 3150
AAGTGTGTGG GTGAAGTCAT GCATCTTTTA ATCAAATCCC AAGATGTGTA 3200
TAAACCACCA AACTGCCAAA AAATGAAAAC TGTCGACAAG CTCTGTCCGT 3250
TTGCTGGCAA CTGCAAGGGT CTCAATCCTA TTTGTAATTA TTGAATAATA 3300
AAACAATTAT AAATGCTAAA TTTGTTTTTT ATTAACGATA CAAACCAAAC 3350
GCAACAAGAA CATTTGTAGT ATTATCTATA ATTGAAAACG CGTAGTTATA 3400 
ATCGCTGAGG TAATATTTAA AATCATTTTC AAATGATTCA CAGTTAATTT 3450 
GCGACAATAT AATTTTATTT TCACATAAAC TAGACGCCTT GTCGTCTTCT 3500 
TCTTCGTATT CCTTCTCTTT TTCATTTTTC TCCTCATAAA AATTAACATA 3550
GTTATTATCG TATCCATATA TGTATCTATC GTATAGAGTA AATTTTTTGT 3600 
TGTCATAAAT ATATATGTCT TTTTTAATGG GGTGTATAGT ACCGCTGCGC 3650 
ATAGTTTTTC TGTAATTTAC AACAGTGCTA TTTTCTGGTA GTTCTTCGGA 3700 
GTGTGTTGCT TTAATTATTA AATTTATATA ATCAATGAAT TTGGGATCGT 3750 
CGGTTTTGTA CAATATGTTG CCGGCATAGT ACGCAGCTTC TTCTAGTTCA 3800
ATTACACCAT TTTTTAGCAG CACCGGATTA ACATAACTTT CCAAAATGTT 3850
GTACGAACCG TTAAACAAAA ACAGTTCACC TCCCTTTTCT ATACTATTGT 3900
CTGCGAGCAG TTGTTTGTTG TTAAAAATAA CAGCCATTGT AATGAGACGC 3950
ACAAACTAAT ATCACAAACT GGAAATGTCT ATCAATATAT AGTTGCTGAT 4000
ATCATGGAGA TAATTAAAAT GATAACCATC TCGCAAATAA ATAAGTATTT 4050
TACTGTTTTC GTAACAGTTT TGTAATAAAA AAACCTATAA ATATTCCGGA 4100 
TTATTCATAC CGTCCCACCA TCGGGCGCGG ATCAGATCCA TGGCCAAGTT 4150 
CCTGGTCAAC GTGGCCCTGC TGCTGCTGCT GCTGCTGCTG TCCGGAGCCT 4200
GGGCCCATAT GAGATCCCAT CACCATCACC ATCACATGCC CGCGGGCTCC 4250 
TGGGACCCGG CCGGTTACCT GCTCTACTGC CCCTGCATGG GGCGCTTTGG 4300 
GAACCAGGCC GATCACTTCT TGGGCTCTCT GGCATTTGCA AAGCTGCTAA 4350 
ACCGTACCTT GGCTGTCCCT CCTTGGATTG AGTACCAGCA TCACAAGCCT 4400
CCTTTCACCA ACCTCCATGT GTCCTACCAG AAGTACTTCA AGCTGGAGCC 4450
CCTCCAGGCT TACCATCGGG TCATCAGCTT GGAGGATTTC ATGGAGAAGC 4500
TGGCACCCAC CCACTGGCCC CCTGAGAAGC GGGTGGCATA CTGCTTTGAG 4550
GTGGCAGCCC AGCGAAGCCC AGATAAGAAG ACGTGCCCCA TGAAGGAAGG 4600
AAACCCCTTT GGCCCATTCT GGGATCAGTT TCATGTGAGT TTCAACAAGT 4650
CGGAGCTTTT TACAGGCATT TCCTTCAGTG CTTCCTACAG AGAACAATGG 4700 
AGCCAGAGAT TTTCTCCAAA GGAACATCCG GTGCTTGCCC TGCCAGGAGC 4750 
CCCAGCCCAG TTCCCCGTCC TAGAGGAACA CAGGCCACTA CAGAAGTACA 4800 
TGGTATGGTC AGACGAAATG GTGAAGACGG GAGAGGCCCA GATTCATGCC 4850 
CACCTTGTCC GGCCCTATGT GGGCATTCAT CTGCGCATTG GCTCTGACTG 4900
GAAGAACGCC TGTGCCATGC TGAAGGACGG GACTGCAGGC TCGCACTTCA 4950
TGGCCTCTCC GCAGTGTGTG GGCTACAGCC GCAGCACAGC GGCCCCCCTC 5000
ACGATGACTA TGTGCCTGCC TGACCTGAAG GAGATCCAGA GGGCTGTGAA 5050
GCTCTGGGTG AGGTCGCTGG ATGCCCAGTC GGTCTACGTT GCTACTGATT 5100
CCGAGAGTTA TGTGCCTGAG CTCCAACAGC TCTTCAAAGG GAAGGTGAAG 5150
GTGGTGAGCC TGAAGCCTGA GGTGGCCCAG GTCGACCTGT ACATCCTCGG 5200
CCAAGCCGAC CACTTTATTG GCAACTGTGT CTCCTCCTTC ACTGCCTTTG 5250
TGAAGCGGGA GCGGGACCTC CAGGGGAGGC CGTCTTCTTT CTTCGGCATG 5300
GACAGGCCCC CTAAGCTGCG GGACGAGTTC TGATTCTGGC CGGAGCACCA 5350
GACCCTCTGA TCCTGGAGGG ACCAGAGTCT GAGCTGGTCC TTCCAGCCAG 5400
GCCTGGCAGC CAGAGGTGCT CCGGGATTGC AAACTCCTCT TCTCACCTGC 5450
CAAAGATGGA GAAGAGTGCC AGGGACCCCT CAAGGAGGGA GACGCTCCAT 5500
ATCCCAGGGC ATAGGACTTG CAGGTTCCTA GGAGCAGGAG CATCTCCCAT 5550 
CGCACGTGCT TTCTGCTCTT CTGGGAATTT CTCACACTGG CAAAGCAGTC 5600 
CAGCCTCCGT CTTCTGGTCC ACTCTGCTCT GAGCAGCCTG GGATGCTGAA 5650 
CTCTTCAGAG AGATTTTTTT ATAGAGAGAT TTCTATAATT TTGATACAAG 5700
GTCATGACTA TCCTAGAACT CTCTGTGGTT TTTGAAAATC ATTGAATTCC 5750 
TGCAGCCCGG GGGATCCACT AGTTCTAGTT CTAGAGCGGC CGCTCCAGAA 5800 
TTCTAGAAGG TACCCGGGAT CCTTTCCTGG GACCCGGCAA GAACCAAAAA 5850 
CTCACTCTCT TCAAGGAAAT CCGTAATGTT AAACCCGACA CGATGAAGCT 5900 
TGTCGTTGGA TGGAAAGGAA AAGAGTTCTA CAGGGAAACT TGGACCCGCT 5950 
TCATGGAAGA CAGCTTCCCC ATTGTTAACG ACCAAGAAGT GATGGATGTT 6000 
TTCCTTGTTG TCAACATGCG TCCCACTAGA CCCAACCGTT GTTACAAATT 6050
CCTGGCCCAA CACGCTCTGC GTTGCGACCC CGACTATGTA CCTCATGACG 6100
TGATTAGGAT CGTCGAGCCT TCATGGGTGG GCAGCAACAA CGAGTACCGC 6150
ATCAGCCTGG CTAAGAAGGG CGGCGGCTGC CCAATAATGA ACCTTCACTC 6200
TGAGTACACC AACTCGTTCG AACAGTTCAT CGATCGTGTC ATCTGGGAGA 6250
ACTTCTACAA GCCCATCGTT TACATCGGTA CCGACTCTGC TGAAGAGGAG 6300
GAAATTCTCC TTGAAGTTTC CCTGGTGTTC AAAGTAAAGG AGTTTGCACC 6350
AGACGCACCT CTGTTCACTG GTCCGGCGTA TTAAAACACG ATACATTGTT 6400
ATTAGTACAT TTATTAAGCG CTAGATTCTG TGCGTTGTTG ATTTACAGAC 6450
AATTGTTGTA CGTATTTTAA TAATTCATTA AATTTATAAT CTTTAGGGTG 6500
GTATGTTAGA GCGAAAATCA AATGATTTTC AGCGTCTTTA TATCTGAATT 6550
TAAATATTAA ATCCTCAATA GATTTGTAAA ATAGGTTTCG ATTAGTTTCA 6600
AACAAGGGTT GTTTTTCCGA ACCGATGGCT GGACTATCTA ATGGATTTTC 6650
GCTCAACGCC ACAAAACTTG CCAAATCTTG TAGCAGCAAT CTAGCTTTGT 6700
CGATATTCGT TTGTGTTTTG TTTTGTAATA AAGGTTCGAC GTCGTTCAAA 6750
ATATTATGCG CTTTTGTATT TCTTTCATCA CTGTCGTTAG TGTACAATTG 6800 
ACTCGACGTA AACACGTTAA ATAAAGCTTG GACATATTTA ACATCGGGCG 6850 
TGTTAGCTTT ATTAGGCCGA TTATCGTCGT CGTCCCAACC CTCGTCGTTA 6900 
GAAGTTGCTT CCGAAGACGA TTTTGCCATA GCCACACGAC GCCTATTAAT 6950
TGTGTCGGCT AACACGTCCG CGATCAAATT TGTAGTTGAG CTTTTTGGAA 7000 
TTATTTCTGA TTGCGGGCGT TTTTGGGCGG GTTTCAATCT AACTGTGCCC 7050 
GATTTTAATT CAGACAACAC GTTAGAAAGC GATGGTGCAG GCGGTGGTAA 7100 
CATTTCAGAC GGCAAATCTA CTAATGGCGG CGGTGGTGGA GCTGATGATA 7150 
AATCTACCAT CGGTGGAGGC GCAGGCGGGG CTGGCGGCGG AGGCGGAGGC 7200
GGAGGTGGTG GCGGTGATGC AGACGGCGGT TTAGGCTCAA ATGTCTCTTT 7250
AGGCAACACA GTCGGCACCT CAACTATTGT ACTGGTTTCG GGCGCCGTTT 7300
TTGGTTTGAC CGGTCTGAGA CGAGTGCGAT TTTTTTCGTT TCTAATAGCT 7350
TCCAACAATT GTTGTCTGTC GTCTAAAGGT GCAGCGGGTT GAGGTTCCGT 7400
CGGCATTGGT GGAGCGGGCG GCAATTCAGA CATCGATGGT GGTGGTGGTG 7450
GTGGAGGCGC TGGAATGTTA GGCACGGGAG AAGGTGGTGG CGGCGGTGCC 7500 
GCCGGTATAA TTTGTTCTGG TTTAGTTTGT TCGCGCACGA TTGTGGGCAC 7550 
CGGCGCAGGC GCCGCTGGCT GCACAACGGA AGGTCGTCTG CTTCGAGGCA 7600
GCGCTTGGGG TGGTGGCAAT TCAATATTAT AATTGGAATA CAAATCGTAA 7650
AAATCTGCTA TAAGCATTGT AATTTCGCTA TCGTTTACCG TGCCGATATT 7700 
TAACAACCGC TCAATGTAAG CAATTGTATT GTAAAGAGAT TGTCTCAAGC 7750 
TCCGCACGCC GATAACAAGC CTTTTCATTT TTACTACAGC ATTGTAGTGG 7800 
CGAGACACTT CGCTGTCGTC GACGTACATG TATGCTTTGT TGTCAAAAAC 7850 
GTCGTTGGCA AGCTTTAAAA TATTTAAAAG AACATCTCTG TTCAGCACCA 7900 
CTGTGTTGTC GTAAATGTTG TTTTTGATAA TTTGCGCTTC CGCAGTATCG 7950 
ACACGTTCAA AAAATTGATG CGCATCAATT TTGTTGTTCC TATTATTGAA 8000 
TAAATAAGAT TGTACAGATT CATATCTACG ATTCGTCATG GCCACCACAA 8050 
ATGCTACGCT GCAAACGCTG GTACAATTTT ACGAAAACTG CAAAAACGTC 8100 
AAAACTCGGT ATAAAATAAT CAACGGGCGC TTTGGCAAAA TATCTATTTT 8150 
ATCGCACAAG CCCACTAGCA AATTGTATTT GCAGAAAACA ATTTCGGCGC 8200
ACAATTTTAA CGCTGACGAA ATAAAAGTTC ACCAGTTAAT GAGCGACCAC 8250
CCAAATTTTA TAAAAATCTA TTTTAATCAC GGTTCCATCA ACAACCAAGT 8300
GATCGTGATG GACTACATTG ACTGTCCCGA TTTATTTGAA ACACTACAAA 8350
TTAAAGGCGA GCTTTCGTAC CAACTTGTTA GCAATATTAT TAGACAGCTG 8400
TGTGAAGCGC TCAACGATTT GCACAAGCAC AATTTCATAC ACAACGACAT 8450
AAAACTCGAA AATGTCTTAT ATTTCGAAGC ACTTGATCGC GTGTATGTTT 8500
GCGATTACGG ATTGTGCAAA CACGAAAACT CACTTAGCGT GCACGACGGC 8550 
ACGTTGGAGT ATTTTAGTCC GGAAAAAATT CGACACACAA CTATGCACGT 8600 
TTCGTTTGAC TGGTACGCGG CGTGTTAACA TACAAGTTGC TAACCGGCGG 8650 
TTCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA 8700
CAATTCCACA CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT 8750
GCCTAATGAG TGAGCTAACT CACATTAATT GCGTTGCGCT CACTGCCCGC 8800 
TTTCCAGTCG GGAAACCTGT CGTGCCAGCT GCATTAATGA ATCGGCCAAC 8850 
GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC 8900 
ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA 8950 
CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA 9000 
AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC 9050 
GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA 9100
AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT 9150
ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC 9200
CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC 9250
GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC 9300
GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC 9350
GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT 9400
ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG 9450
TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT 9500
AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG 9550
AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG 9600
GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT 9650 
CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA 9700 
AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA 9750
CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA 9800
TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC 9850 
TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG 9900 
TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT 9950 
GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT 10000
AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT 10050 
CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT 10100 
TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT 10150 
GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC 10200 
GATCAAGGCG AGTTACATGA TCCCCCATGT TGTGCAAAAA AGCGGTTAGC 10250 
TCCTTCGGTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG CAGTGTTATC 10300 
ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC ATGCCATCCG 10350 
TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA 10400 
TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA 10450 
TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT 10500
CTTCGGGGCG AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCAGTTCG 10550
ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT TTACTTTCAC 10600
CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC GCAAAAAAGG 10650 
GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA 10700 
TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT 10750 
TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC 10800 
GAAAAGTGCC ACCTGACGTC TAAGAAACCA TTATTATCAT GACATTAACC 10850
TATAAAAATA GGCGTATCAC GAGGCCCTTT CGTCTCGCGC GTTTCGGTGA 10900
TGACGGTGAA AACCTCTGAC ACATGCAGCT CCCGGAGACG GTCACAGCTT 10950
GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG CGCGTCAGCG 11000
GGTGTTGGCG GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT 11050
TGTACTGAGA GTGCACCATA TATGCGGTGT GAAATACCGC ACAGATGCGT 11100 
AAGGAGAAAA TACCGCATCA GGCGCCATTC GCCATTCAGG CTGCGCAACT 11150 
GTTGGGAAGG GCGATCGGTG CGGGCCTCTT CGCTATTACG CCAGCTGGCG 11200 
AAAGGGGGAT GTGCTGCAAG GCGATTAAGT TGGGTAACGC CAGGGTTTTC 11250
CCAGTCACGA CGTTGTAAAA CGACGGCCAG TGCC 11284

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 397 amino acids

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Ala Lys Phe Leu Val Asn Val Ala Leu Leu Leu Leu Leu Leu
1 5 10 15

Leu Leu Ser Gly Ala Trp Ala His Met Arg Ser His His His His
20 25 30

His His Met Pro Ala Gly Ser Trp Asp Pro Ala Gly Tyr Leu Leu
35 40 45

Tyr Cys Pro Cys Met Gly Arg Phe Gly Asn Gln Ala Asp His Phe
50 55 60

Leu Gly Ser Leu Ala Phe Ala Lys Leu Leu Asn Arg Thr Leu Ala
65 70 75

Val Pro Pro Trp Ile Glu Tyr Gln His His Lys Pro Pro Phe Thr
80 85 90

Asn Leu His Val Ser Tyr Gln Lys Tyr Phe Lys Leu Glu Pro Leu
95 100 105

Gln Ala Tyr His Arg Val Ile Ser Leu Glu Asp Phe Met Glu Lys
110 115 120

Leu Ala Pro Thr His Trp Pro Pro Glu Lys Arg Val Ala Tyr Cys
125 130 135

Phe Glu Val Ala Ala Gln Arg Ser Pro Asp Lys Lys Thr Cys Pro
140 145 150

Met Lys Glu Gly Asn Pro Phe Gly Pro Phe Trp Asp Gln Phe His
155 160 165

Val Ser Phe Asn Lys Ser Glu Leu Phe Thr Gly Ile Ser Phe Ser
170 175 180

Ala Ser Tyr Arg Glu Gln Trp Ser Gln Arg Phe Ser Pro Lys Glu
185 190 195

His Pro Val Leu Ala Leu Pro Gly Ala Pro Ala Gln Phe Pro Val
200 205 210

Leu Glu Glu His Arg Pro Leu Gln Lys Tyr Met Val Trp Ser Asp
215 220 225

Glu Met Val Lys Thr Gly Glu Ala Gln Ile His Ala His Leu Val
230 235 240

Arg Pro Tyr Val Gly Ile His Leu Arg Ile Gly Ser Asp Trp Lys
245 250 255

Asn Ala Cys Ala Met Leu Lys Asp Gly Thr Ala Gly Ser His Phe
260 265 270

Met Ala Ser Pro Gln Cys Val Gly Tyr Ser Arg Ser Thr Ala Ala
275 280 285

Pro Leu Thr Met Thr Met Cys Leu Pro Asp Leu Lys Glu Ile Gln
290 295 300

Arg Ala Val Lys Leu Trp Val Arg Ser Leu Asp Ala Gln Ser Val
305 310 315

Tyr Val Ala Thr Asp Ser Glu Ser Tyr Val Pro Glu Leu Gln Gln
320 325 330

Leu Phe Lys Gly Lys Val Lys Val Val Ser Leu Lys Pro Glu Val
335 340 345

Ala Gln Val Asp Leu Tyr Ile Leu Gly Gln Ala Asp His Phe Ile
350 355 360

Gly Asn Cys Val Ser Ser Phe Thr Ala Phe Val Lys Arg Glu Arg
365 370 375

Asp Leu Gln Gly Arg Pro Ser Ser Phe Phe Gly Met Asp Arg Pro
380 385 390

Pro Lys Leu Arg Asp Glu Phe
395 397

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5009 base pairs

(B) TYPE: Nucleic Acid

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:



GAACCAGGCC GATCACTTCT TGGGCTCTCT GGCATTTGCA AAGCTGCTAA 50 
ACCGTACCTT GGCTGTCCCT CCTTGGATTG AGTACCAGCA TCACAAGCCT 100
CCTTTCACCA ACCTCCATGT GTCCTACCAG AAGTACTTCA AGCTGGAGCC 150
CCTCCAGGCT TACCATCGGG TCATCAGCTT GGAGGATTTC ATGGAGAAGC 200
TGGCACCCAC CCACTGGCCC CCTGAGAAGC GGGTGGCATA CTGCTTTGAG 250
GTGGCAGCCC AGCGAAGCCC AGATAAGAAG ACGTGCCCCA TGAAGGAAGG 300 
AAACCCCTTT GGCCCATTCT GGGATCAGTT TCATGTGAGT TTCAACAAGT 350 
CGGAGCTTTT TACAGGCATT TCCTTCAGTG CTTCCTACAG AGAACAATGG 400 
AGCCAGAGAT TTTCTCCAAA GGAACATCCG GTGCTTGCCC TGCCAGGAGC 450 
CCCAGCCCAG TTCCCCGTCC TAGAGGAACA CAGGCCACTA CAGAAGTACA 500
TGGTATGGTC AGACGAAATG GTGAAGACGG GAGAGGCCCA GATTCATGCC 550
CACCTTGTCC GGCCCTATGT GGGCATTCAT CTGCGCATTG GCTCTGACTG 600
GAAGAACGCC TGTGCCATGC TGAAGGACGG GACTGCAGGC TCGCACTTCA 650
TGGCCTCTCC GCAGTGTGTG GGCTACAGCC GCAGCACAGC GGCCCCCCTC 700
ACGATGACTA TGTGCCTGCC TGACCTGAAG GAGATCCAGA GGGCTGTGAA 750 
GCTCTGGGTG AGGTCGCTGG ATGCCCAGTC GGTCTACGTT GCTACTGATT 800 
CCGAGAGTTA TGTGCCTGAG CTCCAACAGC TCTTCAAAGG GAAGGTGAAG 850 
GTGGTGAGCC TGAAGCCTGA GGTGGCCCAG GTCGACCTGT ACATCCTCGG 900 
CCAAGCCGAC CACTTTATTG GCAACTGTGT CTCCTCCTTC ACTGCCTTTG 950 
TGAAGCGGGA GCGGGACCTC CAGGGGAGGC CGTCTTCTTT CTTCGGCATG 1000
GACAGGCCCC CTAAGCTGCG GGACGAGTTC TGATTCTGGC CGGAGCACCA 1050
GACCCTCTGA TCCTGGAGGG ACCAGAGTCT GAGCTGGTCC TTCCAGCCAG 1100
GCCTGGCAGC CAGAGGTGCT CCGGGATTGC AAACTCCTCT TCTCACCTGC 1150
CAAAGATGGA GAAGAGTGCC AGGGACCCCT CAAGGAGGGA GACGCTCCAT 1200
ATCCCAGGGC ATAGGACTTG CAGGTTCCTA GGAGCAGGAG CATCTCCCAT 1250
CGCACGTGCT TTCTGCTCTT CTGGGAATTT CTCACACTGG CAAAGCAGTC 1300
CAGCCTCCGT CTTCTGGTCC ACTCTGCTCT GAGCAGCCTG GGATGCTGAA 1350
CTCTTCAGAG AGATTTTTTT ATAGAGAGAT TTCTATAATT TTGATACAAG 1400
GTCATGACTA TCCTAGAACT CTCTGTGGTT TTTGAAAATC ATTGAATTCT 1450 
ATTAATGTAG GTACCTAAAG TGACCTTAAC TGAATGTGGA TGAGGCTGGG 1500 
GCTGGTGTGG GTCTTTTGGC TGCTTTTCAA GGTGTCCCCC AATGTGGCCC 1550 
TCAAGAGCCA TCCCCACTGC CTGGCCAGAG CCATTGTTGT CCCCTACTTC 1600 
CTAGGCCATT TCTGGGGCTT GGGGGATGAA TGCTGTCGTG TGCTGTAAAC 1650 
ACTATGCAAA TGGAAGTTAT CGGTTGTGGT GCTGTGCAGC GCTCTGTGGG 1700 
CGACTAAGTG CCACTCACGC AGCATGTTCC TGGCAAGGAG CACATACCAT 1750 
CAAGCCACAC TATCATGGTA TTGTTCTCAC AGTCTTTTGG TGGTTGATGG 1800 
CCACTGCAAA CCTGGCACCA TCAGATCTCT TCTGATCTCT TGCCCCAGTG 1850 
GGGCCTGGTT GGTAGAATGT TGGCATTCGG TTGATATCCA AAGCCTGTTC 1900 
TCCCAGCCGT CCTCCTGCAG CTGGAGCCTT CAGGCCGTAT TCTCACGAGG 1950 
GAACGTTTGC CAAGGCTCTG ACCTCACAGA AGATGCCCAG GGCCCAGAAG 2000
CCATCAGAAT TATCAGTGGA GAAGCACCTT TTGACTCTTC CCTTCCAATG 2050
TAATCTCTGC CAACACCATG AGGCTTAAGG TGCTCTAAGT CATGAGTGTT 2100
TTGGTCTCAA ATGCTGCAGT TTTAATAATC TGTGACTCCT GAGAGCCCAT 2150
GGTTTTTTGA CCTTGTGGTT CTAAAATTCC TTGTCTGACC CCTGTAGATC 2200
TTTTCCTTGC CATGTCACCT CCCTTGGCCT TTGATCCTGG AAAGGTGGCA 2250
GAGCCTCCAC TGAGCCAGGC CCAGAGCTCC TTGCAGTGCC TTCTTCCTTG 2300
TTTACCTGTG GGAGGAAACA CTTTTTTTGT CAGGGGCAGC CTGGTTCAGA 2350
GCTCAGAGGT CACACTGTAT CAAAGATCTC AAACAGCAAA GTCAGCATTT 2400
GCTGTATAGA GCTGCCACCC AACTCTAAGC AGGAGAAACT GTACAGAAAG 2450 
GGCTTTGCTA TTTTTCCCTT TTGGGAAAAC AATGAAGTGT TTTAAGTCCT 2500 
GGGTGGACTG AGAGATGGTT TGCCTGTCCA GACTTGCTCT CAAGCCTCAT 2550 
CCAGAGAAGG AGCTGCAGAT GAGGGAGCCC GTACACTCCC TGCCACCACT 2600 
AGGTTGTAAG CCTGTAGCTG GCTGGCTGAT TTCATTTTGG AATTCATTTG 2650 
CCATCCACAG CCTTACACTA GGCACACACT TTAGAGTCTG GGGCTCCAGT 2700 
GGGGCCCGCC TAATTTTTTT TCCCCCCAAG ACAGGGCCTT GCTCTGTCTC 2750 
CCAGGCTGGA GTGCAGTGGC ATGATCATGG CTTACTGCAG CCTTGATCTC 2800 
CCAGGCTCAA GCGATCCTTC TGCCTCAGCC TCTCTGGTAG CTGAGACTGC 2850 
ATGCCCAGCT CCAAATCACC TTGATTCATA TCAGCAGTAA TAATCACTTG 2900
TGTTCTGAAA GAAAGGGCAC CAGAAGTTCT AGCAAAATTC AGTTGTGTTC 2950
TGTGAGCTAG CACTTTTTCC TCTGACCCAA TTTTCTTACC TATAAAATGG 3000
TGATAAAAAC CGACAGGTTG TTCAAAGGCC CAGATCAGCT AAAGCATGTA 3050 
TATAAGAGCA CGTTGTAAAC TTGAAAGAGA CAAAGGCACA AATGTGGCTG 3100 
TTGATTAATT TGACTGCTTC TCGTTGCTCG TCACCTCCAT GCCAGGCACT 3150 
GTGCTTGCTA ATTGCTTTAT GGGGGCATTC TCTTATTTAT TCCCCAGCCC 3200 
TGGGAAATAG GAGCTGTCAT TATCCTTCTC TTTCTGCACA AGGAAAAATT 3250 
AATGCCCTGA GAATTGTCAT AATTTTCCCA AGGCTGCCCA GCTGGTGGTG 3300
TTAAGCCAGA ATTTGACCTC CCAGAGCCAG TTTCCATTAG CTGCCATGCT 3350
CTGCTGCCTC TAATTCACAG AATGCACTTT CTACCCTGTG TGCCATGGAG 3400
ACCTCCTATG GAAAAATGAT CAGCCACCTT ACCTTCTACT GGGTACCTGC 3450
TGTGAGTCTG CCTATGCCAG AAGGATTAAG GAGGGGAGGT TACCCAAGAA 3500
ACAAAGCCTA CATGCCGCTT ACAGCCCCCG TTGGATGGTT GCTCAGTACA 3550
ACAGTCTTGC ATTCAGCAGG TGTTTGTTCA TCACCTACTA TGTGTCAGGC 3600
TCTATGCTAG GTACTGGGGA TACAGGAGAG AATCAAGCGT AAAGTCTTTG 3650
TTCTCAAGGA ATTTGCATTC TAGAAAGTAG AAGATGTAAT AAATGTACTG 3700
TGGGACATGT TAATAAGTGC TATAAAGAAA TATAAAGGGT TTGGGAGCAA 3750 
AAAGAGGGAG TGGATCTATT TTAGATGAGC CCAGGTAAGA CCTCTCTGAA 3800 
GAGCTGTCAT GAAGGAGGGA GGGAGCACAT TCCTGGCAGA GAAAACAGCA 3850 
CGTGCAAAGG CCCCGAGACT GGAGTGTGTT CCTGAAGAGC AGCCAGGAGG 3900 
CCAGCATGGC TGGAGAGGCA GGCATAGGCA GGGAACCGAG CAGCAGGTCA 3950
GAGCAGGCGA GCTGACATTC TGCAGCCTGG ACGGCCATGG CAGGAAGCTT 4000
TTAGTTGGAG AGATACAGGA AGCCTCCTAG GGTTCTGAGC AGAAGAGGGG 4050
CATGAGCTGA TTCACATTCT GAAGGACCTC TCTAGCTGGC CAGTGCTGAG 4100
GAGGTTGGAG AGAGAAAGGG TGAAAGCAGA GAGACCAGTG CAGGGCTGTT 4150
AACAGGGTTG CAGGCGAGAG ACTGGGGTGC TGGGCTCCCC TAGACTAGGA 4200
CTCCAGTGCC CTCCTCTCCC AAGAGACAAA GGCCATTGCA TTGAAGGAGG 4250
TGGGAAATGA TTAGATTCTG AACATATGTA ATTATTTTTC AGTCTTTTTC 4300
AAAGATACAA ATATTTACAT AGTTTTAATC ATGTAATATA TACAATTTAA 4350
TGTCCTAGTG TTTTACTTAA TAGTGTATCA TGTTTTCCCT GTTGGTATGT 4400
AGCCTGGATA AATGCTCTTA ATTATAAAAA ATTCTGTCGA GGAGTGTTCC 4450
ATAGTTTATT GTTTTCCTAT TATGAGAATT TAGGCCAAGT GTGGTGGCTC 4500
ATGCCTGTAA TCCCAGCACT TTGCGAGGCC GAGGTGGGCA GATCACTTGA 4550
GGTGAGGAGT TCAAGACCAG CCTGGCCAAC ATGGTGAATT ATCTCTACTA 4600 
AAAATACAAA AAAATAATAA TAATAGCCAG GCGTGGTGGC ACATGCCTGT 4650 
ATTCCCAGCT GCTTGGGAGG CTGAGGCAGG AGAATGGCTT GAACCTGGGA 4700 
GGTGGAGGTT GCAGTGAGCC GAGATGGTGC CACTGCATTC CAGCCTGGGC 4750 
AACAGAGCGA GACTCCATCT CAAAAAAAAG GAGACTTCAT GTGCCCCCAA 4800 
TTTTTCACTA TTGTTATTTG AAAAAATATT TTTATTTGTA AGAGTTTTTC 4850 
TTTATTTAAA ATGTTCATTA ATAAAGTTGT TGGACGGGAA GCAAAAAAAA 4900 
AAAGTTGTTT AAGATAAATT CCCAGAAGTG AATTTGTTAG ATCAAACACT 4950
TAAAACTTTT TGTTATGGAA GAATTCAAAT ATAAATAAAA AATTGTGAGT 5000 
AATAAAATG 5009 

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 amino acids 

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ser Asn Tyr Arg Tyr Ser Lys Leu Asn Glu Glu Glu lie Ser 
15 10 15 

Leu Glu Asp Met Pro Ser Ser Ala Asn Gin lie Leu Thr Arg Gin 

20 25 30 

Glu Gin lie lie Gin Glu Gin Asp Asp Glu Leu Glu Leu Val Gly 

35 40 45 

Asn Ser Val Arg Thr Leu Arg Gly Met Ser Ser Met He Gly Asp 

50 55 60 

84 



Glu Leu Asp Gin Gin Ser Thr Met Leu Asp Asp Leu Gly Gin Glu 

65 70 75 



Met Glu Tyr Ser Glu Thr Arg Leu Asp Thr Ala Met Lys Lys Met 

80 85 90 

Ala Lys Leu Thr His Leu Glu Asp Gly Met Leu Leu Ala Arg Arg 

95 100 105 

Ile Val Gln Ser Met Gln Asn Asp His Gly Ala Leu Ser Ser Pro

110 115 120

Val Phe Pro Arg Leu Cys Pro Ser Gly Leu Thr Thr Tyr Val Pro 

125 130 135 

Tyr Ile Val Asp Phe Ser Ser Leu Thr Phe His Ile Phe Ile Ile

140 145 150 

Ile Ile Ile Ile Ile Ile Asp Phe Cys Ser Gln Ser Gln Ser Lys

155 160 165 

Gly Arg Phe Gly Asn Gln Val Asp Gln Phe Leu Gly Val Leu Ala

170 175 180 

Phe Ala Lys Ala Leu Asp Arg Thr Leu Val Leu Pro Asn Phe Ile

185 190 195 

Glu Phe Lys His Pro Glu Thr Lys Met Ile Pro Phe Glu Phe Leu

200 205 210 

Phe Gln Val Gly Thr Val Ala Lys Tyr Thr Arg Val Val Thr Met

215 220 225 

Gln Glu Phe Thr Lys Lys Ile Met Pro Thr His Phe Val Gly Thr

230 235 240 

Pro Arg Gln Ala Ile Tyr Asp Lys Ser Ala Glu Pro Gly Cys His

245 250 255 

Ser Lys Glu Gly Asn Pro Phe Gly Pro Tyr Trp Asp Gln Ile Asp

260 265 270 
Val Ser Phe Val Gly Asp Glu Tyr Phe Gly Asp Ile Pro Gly Gly

275 280 285 



Phe Asp Leu Asn Gln Met Gly Ser Arg Lys Lys Trp Leu Glu Lys

290 295 300 

Phe Pro Ser Glu Glu Tyr Pro Val Leu Ala Phe Ser Ser Ala Pro 

305 310 315 

Ala Pro Phe Pro Ser Lys Gly Lys Val Trp Ser Ile Gln Lys Tyr

320 325 330 

Leu Arg Trp Ser Ser Arg Ile Thr Glu Gln Ala Lys Lys Phe Ile

335 340 345 

Ser Ala Asn Leu Ala Lys Pro Phe Val Ala Val His Leu Arg Asn 

350 355 360 

Asp Ala Asp Trp Val Arg Val Cys Glu His Ile Asp Thr Thr Thr

365 370 375 

Asn Arg Pro Leu Phe Ala Ser Glu Gln Cys Leu Gly Glu Gly His

380 385 390 

His Leu Gly Thr Leu Thr Lys Glu Ile Cys Ser Pro Ser Lys Gln

395 400 405 

Gln Ile Leu Glu Gln Ile Glu Ala His Arg Gln Glu Pro Asp Asp

410 415 420 

Met Tyr Thr Ser Leu Ala Ile Met Gly Arg Ala Asp Leu Phe Val

425 430 435 

Gly Asn Cys Val Ser Thr Phe Ser His Ile Val Lys Arg Glu Arg

440 445 450 

Asp His Ala Gly Gln Ser Pro Arg Pro Ser Ala Phe Phe Gly Ile

455 460 465 

Arg Ala Val Lys Arg His Ile Asp Leu

470 474 
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 amino acids 

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Pro Ala Gly Ser Trp Asp Pro Ala Gly Tyr Leu Leu Tyr Cys 
1 5 10 15

Pro Cys Met Gly Arg Phe Gly Asn Gln Ala Asp His Phe Leu Gly

20 25 30 

Ser Leu Ala Phe Ala Lys Leu Leu Asn Arg Thr Leu Ala Val Pro 

35 40 45 

Pro Trp Ile Glu Tyr Gln His His Lys Pro Pro Phe Thr Asn Leu

50 55 60 

His 
61 

(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



CTTCTTGGGC TCTCTGGCAT TTGCAAAGCT GCTAAACCGT 40
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single 
(D) TOPOLOGY: Linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



TTCGACGATT TGGCATGGAA CCGACAGGGA GGAACCTAAC 40
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



CTTCTTGGGC TCTCTGGCAT TTGCAAAGCT GCTAAACCGT 40
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 
TCCCTGGGGA GTTCCTCCCT CTGCGAGGTA 30 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Arg Ser His His His His His His Met Pro Ala Gly Ser Trp Asp 
1 5 10 15
Pro Ala Gly Tyr Leu Leu Tyr Xaa Pro Xaa Met Gly Arg

20 25 28 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 amino acids 

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Thr Val Asp Gly Asp Gln Cys Glu Ser Asn Pro Cys Leu Asn Gly
1 5 10 15

Gly Ser Cys Lys Asp Asp Ile Asn Ser Tyr Glu Cys Trp Cys Pro

20 25 30 

Phe Gly Phe Glu Gly Lys Asn Cys Glu Leu Asp Val Thr His His 

35 40 45 

His His His His Gly Ser Ala 

50 52 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1100 base pairs 

(B) TYPE: Nucleic Acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



ATGCCCGCGG GCTCCTGGGA CCCGGCCGGT TACCTGCTCT ACTGCCCCTG 50
CATGGGGCGC TTTGGGAACC AGGCCGATCA CTTCTTGGGC TCTCTGGCAT 100 
TTGCAAAGCT GCTAAACCGT ACCTTGGCTG TCCCTCCTTG GATTGAGTAC 150 
CAGCATCACA AGCCTCCTTT CACCAACCTC CATGTGTCCT ACCAGAAGTA 200
CTTCAAGCTG GAGCCCCTCC AGGCTTACCA TCGGGTCATC AGCTTGGAGG 250
ATTTCATGGA GAAGCTGGCA CCCACCCACT GGCCCCCTGA GAAGCGGGTG 300 
GCATACTGCT TTGAGGTGGC AGCCCAGCGA AGCCCAGATA AGAAGACGTG 350
CCCCATGAAG GAAGGAAACC CCTTTGGCCC ATTCTGGGAT CAGTTTCATG 400 
TGAGTTTCAA CAAGTCGGAG CTTTTTACAG GCATTTCCTT CAGTGCTTCC 450 
TACAGAGAAC AATGGAGCCA GAGATTTTCT CCAAAGGAAC ATCCGGTGCT 500 
TGCCCTGCCA GGAGCCCCAG CCCAGTTCCC CGTCCTAGAA GAACACAGGC 550 
CACTACAGAA GTACATGGTA TGGTCAGACG AAATGGTGAA GACGGGAGAG 600 
GCCCAGATTC ATGCCCACCT TGTCCGGCCC TATGTGGGCA TTCATCTGCG 650
CATTGGCTCT GACTGGAAGA ACGCCTGTGC CATGCTGAAG GACGGGACTG 700
CAGGCTCGCA CTTCATGGCC TCTCCGCAGT GTGTGGGCTA CAGCCGCAGC 750 
ACAGCGGCCC CCCTCACGAT GACTATGTGC CTGCCTGACC TGAAGGAGAT 800
CCAGAGGGCT GTGAAGCTCT GGGTGAGGTC GCTGGATGCC CAGTCGGTCT 850
ACGTTGCTAC TGATTCCGAG AGTTATGTGC CTGAGCTCCA ACAGCTCTTC 900 
AAAGGGAAGG TGAAGGTGGT GAGCCTGAAG CCTGAGGTGG CCCAGGTCGA 950 
CCTGTACATC CTCGGCCAAG CCGACCACTT TATTGGCAAC TGTGTCTCCT 1000
CCTTCACTGC CTTTGTGAAG CGGGAGCGGG ACCTCCAGGG GAGGCCGTCT 1050 
TCTTTCTTCG GCATGGACAG GCCCCCTAAG CTGCGGGACG AGTTCTGATT 1100
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 343 amino acids 

(B) TYPE: Amino Acid 
(D) TOPOLOGY: Linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:



Asn Gln Ala Asp His Phe Leu Gly Ser Leu Ala Phe Ala Lys Leu

1 5 10 15

Leu Asn Arg Thr Leu Ala Val Pro Pro Trp Ile Glu Tyr Gln His

20 25 30 

His Lys Pro Pro Phe Thr Asn Leu His Val Ser Tyr Gln Lys Tyr

35 40 45 

Phe Lys Leu Glu Pro Leu Gln Ala Tyr His Arg Val Ile Ser Leu

50 55 60 

Glu Asp Phe Met Glu Lys Leu Ala Pro Thr His Trp Pro Pro Glu 

65 70 75 

Lys Arg Val Ala Tyr Cys Phe Glu Val Ala Ala Gln Arg Ser Pro

80 85 90 

Asp Lys Lys Thr Cys Pro Met Lys Glu Gly Asn Pro Phe Gly Pro 

95 100 105 

Phe Trp Asp Gln Phe His Val Ser Phe Asn Lys Ser Glu Leu Phe

110 115 120 

Thr Gly Ile Ser Phe Ser Ala Ser Tyr Arg Glu Gln Trp Ser Gln

125 130 135 

Arg Phe Ser Pro Lys Glu His Pro Val Leu Ala Leu Pro Gly Ala 

140 145 150 

Pro Ala Gln Phe Pro Val Leu Glu Glu His Arg Pro Leu Gln Lys

155 160 165 

Tyr Met Val Trp Ser Asp Glu Met Val Lys Thr Gly Glu Ala Gln

170 175 180 

Ile His Ala His Leu Val Arg Pro Tyr Val Gly Ile His Leu Arg

185 190 195 

Ile Gly Ser Asp Trp Lys Asn Ala Cys Ala Met Leu Lys Asp Gly

200 205 210 
Thr Ala Gly Ser His Phe Met Ala Ser Pro Gln Cys Val Gly Tyr

215 220 225 



Ser Arg Ser Thr Ala Ala Pro Leu Thr Met Thr Met Cys Leu Pro 

230 235 240 

Asp Leu Lys Glu Ile Gln Arg Ala Val Lys Leu Trp Val Arg Ser

245 250 255 

Leu Asp Ala Gln Ser Val Tyr Val Ala Thr Asp Ser Glu Ser Tyr

260 265 270 

Val Pro Glu Leu Gln Gln Leu Phe Lys Gly Lys Val Lys Val Val

275 280 285 

Ser Leu Lys Pro Glu Val Ala Gln Val Asp Leu Tyr Ile Leu Gly

290 295 300 

Gln Ala Asp His Phe Ile Gly Asn Cys Val Ser Ser Phe Thr Ala

305 310 315 

Phe Val Lys Arg Glu Arg Asp Leu Gln Gly Arg Pro Ser Ser Phe

320 325 330 

Phe Gly Met Asp Arg Pro Pro Lys Leu Arg Asp Glu Phe 

335 340 343 